Goto

Collaborating Authors

 neural space



Shared Neural Space: Unified Precomputed Feature Encoding for Multi-Task and Cross Domain Vision

arXiv.org Artificial Intelligence

The majority of AI models in imaging and vision are customized to perform on specific high-precision task. However, this strategy is inefficient for applications with a series of modular tasks, since each requires a mapping into a disparate latent domain. To address this inefficiency, we proposed a universal Neural Space (NS), where an encoder-decoder framework pre-computes features across vision and imaging tasks. Our encoder learns transformation aware, generalizable representations, which enable multiple downstream AI modules to share the same feature space. This architecture reduces redundancy, improves generalization across domain shift, and establishes a foundation for effecient multi-task vision pipelines. Furthermore, as opposed to larger transformer backbones, our backbone is lightweight and CNN-based, allowing for wider across hardware. We furthur demonstrate that imaging and vision modules, such as demosaicing, denoising, depth estimation and semantic segmentation can be performed efficiently in the NS.



An Investigation of Conformal Isometry Hypothesis for Grid Cells

arXiv.org Machine Learning

This paper investigates the conformal isometry hypothesis as a potential explanation for the emergence of hexagonal periodic patterns in the response maps of grid cells. The hypothesis posits that the activities of the population of grid cells form a high-dimensional vector in the neural space, representing the agent's self-position in 2D physical space. As the agent moves in the 2D physical space, the vector rotates in a 2D manifold in the neural space, driven by a recurrent neural network. The conformal isometry hypothesis proposes that this 2D manifold in the neural space is a conformally isometric embedding of the 2D physical space, in the sense that local displacements of the vector in neural space are proportional to local displacements of the agent in the physical space. Thus the 2D manifold forms an internal map of the 2D physical space, equipped with an internal metric. In this paper, we conduct numerical experiments to show that this hypothesis underlies the hexagon periodic patterns of grid cells. We also conduct theoretical analysis to further support this hypothesis. In addition, we propose a conformal modulation of the input velocity of the agent so that the recurrent neural network of grid cells satisfies the conformal isometry hypothesis automatically. To summarize, our work provides numerical and theoretical evidences for the conformal isometry hypothesis for grid cells and may serve as a foundation for further development of normative models of grid cells and beyond.


Conformal Normalization in Recurrent Neural Network of Grid Cells

arXiv.org Machine Learning

The responses of the population of grid cells collectively form a vector in a high-dimensional neural activity space, and this vector represents the self-position of the agent in the 2D physical space. As the agent moves, the vector is transformed by a recurrent neural network that takes the velocity of the agent as input. In this paper, we propose a simple and general conformal normalization of the input velocity for the recurrent neural network, so that the local displacement of the position vector in the high-dimensional neural space is proportional to the local displacement of the agent in the 2D physical space, regardless of the direction of the input velocity. Our numerical experiments on the minimally simple linear and non-linear recurrent networks show that conformal normalization leads to the emergence of the hexagon grid patterns. Furthermore, we derive a new theoretical understanding that connects conformal normalization to the emergence of hexagon grid patterns in navigation tasks.


Statistical physics, Bayesian inference and neural information processing

arXiv.org Machine Learning

Lecture notes from the course given by Professor Sara A. Solla at the Les Houches summer school on "Statistical physics of Machine Learning". The notes discuss neural information processing through the lens of Statistical Physics. Contents include Bayesian inference and its connection to a Gibbs description of learning and generalization, Generalized Linear Models as a controlled alternative to backpropagation through time, and linear and non-linear techniques for dimensionality reduction.


Interpolation, extrapolation, and local generalization in common neural networks

arXiv.org Artificial Intelligence

There has been a long history of works showing that neural networks have hard time extrapolating beyond the training set. A recent study by Balestriero et al. (2021) challenges this view: defining interpolation as the state of belonging to the convex hull of the training set, they show that the test set, either in input or neural space, cannot lie for the most part in this convex hull, due to the high dimensionality of the data, invoking the well known curse of dimensionality. Neural networks are then assumed to necessarily work in extrapolative mode. We here study the neural activities of the last hidden layer of typical neural networks. Using an autoencoder to uncover the intrinsic space underlying the neural activities, we show that this space is actually low-dimensional, and that the better the model, the lower the dimensionality of this intrinsic space. In this space, most samples of the test set actually lie in the convex hull of the training set: under the convex hull definition, the models thus happen to work in interpolation regime. Moreover, we show that belonging to the convex hull does not seem to be the relevant criteria. Different measures of proximity to the training set are actually better related to performance accuracy. Thus, typical neural networks do seem to operate in interpolation regime. Good generalization performances are linked to the ability of a neural network to operate well in such a regime.


RigNeRF: A New Deepfakes Method That Uses Neural Radiance Fields

#artificialintelligence

What you're seeing in the image above (middle image, man in blue shirt), as well as the image directly below (left image, man in blue shirt), is not a'real' video into which a small patch of'fake' face has been superimposed, but an entirely synthesized scene that exists solely as a volumetric neural rendering โ€“ including the body and background: In the example directly above, the real-life video on the right (woman in red dress) is used to'puppet' the captured identity (man in blue shirt) on the left via RigNeRF, which (the authors claim) is the first NeRF-based system to achieve separation of pose and expression while being able to perform novel view syntheses. The male figure on the left in the image above was'captured' from a 70-second smartphone video, and the input data (including the entire scene information) subsequently trained across 4 V100 GPUs to obtain the scene. Since 3DMM-style parametric rigs are also available as entire-body parametric CGI proxies (rather than just face rigs), RigNeRF potentially opens up the possibility of full-body deepfakes where real human movement, texture and expression is passed to the CGI-based parametric layer, which would then translate action and expression into rendered NeRF environments and videos. As for RigNeRF โ€“ does it qualify as a deepfake method in the current sense that the headlines understand the term? Or is it just another semi-hobbled also-ran to DeepFaceLab and other labor-intensive, 2017-era autoencoder deepfake systems?


Going deep on deep learning with Dr. Jianfeng Gao - Microsoft Research

#artificialintelligence

Dr. Jianfeng Gao is a veteran computer scientist, an IEEE Fellow and the current head of the Deep Learning Group at Microsoft Research. He and his team are exploring novel approaches to advancing the state-of-the-art on deep learning in areas like NLP, computer vision, multi-modal intelligence and conversational AI. Today, Dr. Gao gives us an overview of the deep learning landscape and talks about his latest work on Multi-task Deep Neural Networks, Unified Language Modeling and vision-language pre-training. He also unpacks the science behind task-oriented dialog systems as well as social chatbots like Microsoft Xiaoice, and gives us some great book recommendations along the way! Jianfeng Gao: Historically, there are two approaches to achieve the goal. One is to use large data. The idea is that if I can collect all the data in the world, then I believe the representation learned from this data is universal. Because I see all of them. The other approach is that, since the goal of this representation is to serve different applications, how about I train the model using application-specific objective functions across many, many different applications? Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. Host: Dr. Jianfeng Gao is a veteran computer scientist, an IEEE Fellow and the current head of the Deep Learning Group at Microsoft Research. He and his team are exploring novel approaches to advancing the state-of-the-art on deep learning in areas like NLP, computer vision, multi-modal intelligence and conversational AI. Today, Dr. Gao gives us an overview of the deep learning landscape and talks about his latest work on Multi-task Deep Neural Networks, Unified Language Modeling and vision-language pre-training. He also unpacks the science behind task-oriented dialog systems as well as social chatbots like Microsoft Xiaoice, and gives us some great book recommendations along the way! What gets you up in the morning? Jianfeng Gao: It's like all the world-class research teams, our goal, ultimate goal, is to advance the state-of-the-art and we want to push the AI frontiers by using deep learning technology or developing new deep learning technologies.


Going deep on deep learning with Dr. Jianfeng Gao - Microsoft Research

#artificialintelligence

Dr. Jianfeng Gao is a veteran computer scientist, an IEEE Fellow and the current head of the Deep Learning Group at Microsoft Research. He and his team are exploring novel approaches to advancing the state-of-the-art on deep learning in areas like NLP, computer vision, multi-modal intelligence and conversational AI. Today, Dr. Gao gives us an overview of the deep learning landscape and talks about his latest work on Multi-task Deep Neural Networks, Unified Language Modeling and vision-language pre-training. He also unpacks the science behind task-oriented dialog systems as well as social chatbots like Microsoft Xiaoice, and gives us some great book recommendations along the way! Jianfeng Gao: Historically, there are two approaches to achieve the goal. One is to use large data. The idea is that if I can collect all the data in the world, then I believe the representation learned from this data is universal. Because I see all of them. The other approach is that, since the goal of this representation is to serve different applications, how about I train the model using application-specific objective functions across many, many different applications? Host: You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting-edge of technology research and the scientists behind it. Host: Dr. Jianfeng Gao is a veteran computer scientist, an IEEE Fellow and the current head of the Deep Learning Group at Microsoft Research. He and his team are exploring novel approaches to advancing the state-of-the-art on deep learning in areas like NLP, computer vision, multi-modal intelligence and conversational AI. Today, Dr. Gao gives us an overview of the deep learning landscape and talks about his latest work on Multi-task Deep Neural Networks, Unified Language Modeling and vision-language pre-training. He also unpacks the science behind task-oriented dialog systems as well as social chatbots like Microsoft Xiaoice, and gives us some great book recommendations along the way! What gets you up in the morning? Jianfeng Gao: It's like all the world-class research teams, our goal, ultimate goal, is to advance the state-of-the-art and we want to push the AI frontiers by using deep learning technology or developing new deep learning technologies.